Relation Extraction with Matrix Factorization and Universal Schemas

Authors

  • Sebastian Riedel
  • Limin Yao
  • Andrew McCallum
  • Benjamin M. Marlin
Abstract

Traditional relation extraction predicts relations within some fixed and finite target schema. Machine learning approaches to this task require either manual annotation or, in the case of distant supervision, existing structured sources of the same schema. The need for existing datasets can be avoided by using a universal schema: the union of all involved schemas (surface form predicates as in OpenIE, and relations in the schemas of preexisting databases). This schema has an almost unlimited set of relations (due to surface forms), and supports integration with existing structured data (through the relation types of existing databases). To populate a database of such a schema we present matrix factorization models that learn latent feature vectors for entity tuples and relations. We show that such latent models achieve substantially higher accuracy than a traditional classification approach. More importantly, by operating simultaneously on relations observed in text and in pre-existing structured DBs such as Freebase, we are able to reason about unstructured and structured data in mutually supporting ways. By doing so our approach outperforms state-of-the-art distant supervision.
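The core idea in the abstract, latent feature vectors for entity tuples and for relations whose product fills in a tuple-by-relation matrix, can be illustrated with a small sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the toy matrix, the column meanings, the logistic loss with randomly sampled negatives, and the names `train_factorization` and `score`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_factorization(observed, n_tuples, n_relations,
                        dim=4, epochs=300, lr=0.1, reg=0.01, seed=0):
    """Learn one latent vector per entity tuple (row) and per relation (column).

    Assumption: logistic loss with one randomly sampled negative per observed
    cell, optimized by plain SGD. This is a generic training recipe chosen for
    the sketch, not necessarily the objective used by the authors.
    """
    rng = np.random.default_rng(seed)
    tuple_vecs = rng.normal(scale=0.1, size=(n_tuples, dim))
    rel_vecs = rng.normal(scale=0.1, size=(n_relations, dim))
    observed_set = set(observed)
    for _ in range(epochs):
        for t, r in observed:
            neg = int(rng.integers(n_tuples))  # candidate negative row
            if (neg, r) in observed_set:
                continue  # skip if the sampled cell is actually observed
            for row, label in ((t, 1.0), (neg, 0.0)):
                err = sigmoid(tuple_vecs[row] @ rel_vecs[r]) - label
                grad_t = err * rel_vecs[r] + reg * tuple_vecs[row]
                grad_r = err * tuple_vecs[row] + reg * rel_vecs[r]
                tuple_vecs[row] -= lr * grad_t
                rel_vecs[r] -= lr * grad_r
    return tuple_vecs, rel_vecs

# Hypothetical universal schema with 4 columns:
#   0: surface pattern "X professor at Y"   1: database relation "employee_of"
#   2: surface pattern "X born in Y"        3: database relation "place_of_birth"
# Rows are 8 hypothetical entity tuples. Rows 3 and 7 are seen only in text.
observed = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0),
            (4, 2), (4, 3), (5, 2), (5, 3), (6, 2), (6, 3), (7, 2)]

tuple_vecs, rel_vecs = train_factorization(observed, n_tuples=8, n_relations=4)

def score(t, r):
    """Estimated probability that relation r holds for entity tuple t."""
    return float(sigmoid(tuple_vecs[t] @ rel_vecs[r]))

# Row 3 only has the "professor at" pattern, yet the model should rank the
# database relation "employee_of" above "place_of_birth" for it.
print(score(3, 1), score(3, 3))
```

In this toy setup the surface pattern and the database relation share rows, so their column vectors end up aligned, and a tuple seen only in text inherits a high score for the structured relation. That is one way to read the abstract's claim about reasoning over unstructured and structured data jointly.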

Similar resources

Towards Combined Matrix and Tensor Factorization for Universal Schema Relation Extraction

Matrix factorization of knowledge bases in universal schema has facilitated accurate distantly supervised relation extraction. This factorization encodes dependencies between textual patterns and structured relations using low-dimensional vectors defined for each entity pair; although these factors are effective at combining evidence for an entity pair, they are inaccurate on rare pairs, or for r...

A Hierarchical Model for Universal Schema Relation Extraction

Relation extraction by universal schema avoids mapping to a brittle, incomplete traditional schema by instead making predictions in the union of all input schemas, including textual patterns. Modeling these predictions by matrix competition with matrix factorization has yielded state-of-the-art accuracies. One difficulty with prior work in matrix factorization, however, is that there is no nega...

Injecting Logical Background Knowledge into Embeddings for Relation Extraction

Matrix factorization approaches to relation extraction provide several attractive features: they support distant supervision, handle open schemas, and leverage unlabeled data. Unfortunately, these methods share a shortcoming with all other distantly supervised approaches: they cannot learn to extract target relations without existing data in the knowledge base, and likewise, these models are in...

Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition

Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that an entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...

Applying Universal Schemas for Domain Specific Ontology Expansion

Manually created large scale ontologies are useful for organizing, searching, and repurposing content ranging from scientific papers and medical guidelines to images. However, maintenance of such ontologies is expensive. In this paper, we investigate the use of universal schemas (Riedel et al., 2013) as a mechanism for ontology maintenance. We apply this approach on top of two unique data sourc...

Publication date: 2013